Exploring Shape Embedding for Cloth-Changing Person Re-Identification via 2D-3D Correspondences
Cloth-Changing Person Re-Identification (CC-ReID) is a common and realistic
problem since fashion constantly changes over time and people's aesthetic
preferences are not set in stone. While most existing cloth-changing ReID
methods focus on learning cloth-agnostic identity representations from coarse
semantic cues (e.g. silhouettes and part segmentation maps), they neglect the
continuous shape distributions at the pixel level. In this paper, we propose
Continuous Surface Correspondence Learning (CSCL), a new shape embedding
paradigm for cloth-changing ReID. CSCL establishes continuous correspondences
between a 2D image plane and a canonical 3D body surface via pixel-to-vertex
classification, which naturally aligns a person image to the surface of a 3D
human model and simultaneously obtains pixel-wise surface embeddings. We
further extract fine-grained shape features from the learned surface embeddings
and then integrate them with global RGB features via a carefully designed
cross-modality fusion module. The shape embedding paradigm based on 2D-3D
correspondences remarkably enhances the model's global understanding of human
body shape. To promote the study of ReID under clothing change, we construct 3D
Dense Persons (DP3D), which is the first large-scale cloth-changing ReID
dataset that provides densely annotated 2D-3D correspondences and a precise 3D
mesh for each person image, while containing diverse cloth-changing cases over
all four seasons. Experiments on both cloth-changing and cloth-consistent ReID
benchmarks validate the effectiveness of our method.
Comment: Accepted by ACM MM 202
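The pixel-to-vertex classification mentioned in the abstract above can be illustrated with a minimal sketch. It assumes that each pixel and each vertex of the canonical body surface carries an embedding vector, and that similarity is a plain dot product followed by a softmax over vertices; the abstract does not specify the similarity measure, so the function name `pixel_to_vertex` and these choices are hypothetical.

```python
import math

def pixel_to_vertex(pixel_embedding, vertex_embeddings):
    """Classify one pixel over the vertices of a canonical surface.

    Assumption (not from the abstract): similarity is a dot product,
    turned into a distribution over vertices with a softmax.
    """
    scores = [sum(p * v for p, v in zip(pixel_embedding, vertex))
              for vertex in vertex_embeddings]
    # Subtract the maximum score for numerical stability of the softmax.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    probs = [e / norm for e in exps]
    # The predicted correspondence is the most probable vertex.
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs

# Toy, made-up embeddings: one pixel and three surface vertices.
best_vertex, vertex_probs = pixel_to_vertex([1.0, 0.0],
                                            [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]])
```

In this toy case the pixel is assigned to the first vertex, whose embedding points in the same direction as the pixel embedding.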
PanopticNeRF-360: Panoramic 3D-to-2D Label Transfer in Urban Scenes
Training perception systems for self-driving cars requires substantial
annotations. However, manual labeling in 2D images is highly labor-intensive.
While existing datasets provide rich annotations for pre-recorded sequences,
they fall short in labeling rarely encountered viewpoints, potentially
hampering the generalization ability for perception models. In this paper, we
present PanopticNeRF-360, a novel approach that combines coarse 3D annotations
with noisy 2D semantic cues to generate consistent panoptic labels and
high-quality images from any viewpoint. Our key insight lies in exploiting the
complementarity of 3D and 2D priors to mutually enhance geometry and semantics.
Specifically, we propose to leverage noisy semantic and instance labels in both
3D and 2D spaces to guide geometry optimization. Simultaneously, the improved
geometry assists in filtering noise present in the 3D and 2D annotations by
merging them in 3D space via a learned semantic field. To further enhance
appearance, we combine MLP and hash grids to yield hybrid scene features,
striking a balance between high-frequency appearance and predominantly
contiguous semantics. Our experiments demonstrate PanopticNeRF-360's
state-of-the-art performance over existing label transfer methods on the
challenging urban scenes of the KITTI-360 dataset. Moreover, PanopticNeRF-360
enables omnidirectional rendering of high-fidelity, multi-view and
spatiotemporally consistent appearance, semantic and instance labels. We make
our code and data available at https://github.com/fuxiao0719/PanopticNeRF.
Comment: Project page: http://fuxiao0719.github.io/projects/panopticnerf360/.
arXiv admin note: text overlap with arXiv:2203.1522
Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation
Large-scale training data with high-quality annotations is critical for
training semantic and instance segmentation models. Unfortunately, pixel-wise
annotation is labor-intensive and costly, raising the demand for more efficient
labeling strategies. In this work, we present a novel 3D-to-2D label transfer
method, Panoptic NeRF, which aims for obtaining per-pixel 2D semantic and
instance labels from easy-to-obtain coarse 3D bounding primitives. Our method
utilizes NeRF as a differentiable tool to unify coarse 3D annotations and 2D
semantic cues transferred from existing datasets. We demonstrate that this
combination allows for improved geometry guided by semantic information,
enabling rendering of accurate semantic maps across multiple views.
Furthermore, this fusion process resolves label ambiguity of the coarse 3D
annotations and filters noise in the 2D predictions. By inferring in 3D space
and rendering to 2D labels, our 2D semantic and instance labels are multi-view
consistent by design. Experimental results show that Panoptic NeRF outperforms
existing semantic and instance label transfer methods in terms of accuracy and
multi-view consistency on challenging urban scenes of the KITTI-360 dataset.
Comment: Project page: https://fuxiao0719.github.io/projects/panopticnerf
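Rendering 2D labels from a 3D field, as described in the abstract above, can be sketched with standard NeRF-style alpha compositing applied to per-sample class distributions along one camera ray. This is only an illustration under that assumption; the function name `composite_semantics` and the toy inputs are hypothetical, and the paper's exact formulation may differ.

```python
import math

def composite_semantics(densities, deltas, class_probs):
    """Alpha-composite per-sample class distributions along one ray.

    densities   : non-negative volume densities at the ray samples
    deltas      : lengths of the ray segments belonging to each sample
    class_probs : one list of class probabilities per sample
    """
    num_classes = len(class_probs[0])
    rendered = [0.0] * num_classes
    transmittance = 1.0  # fraction of the ray not yet absorbed
    for sigma, delta, probs in zip(densities, deltas, class_probs):
        alpha = 1.0 - math.exp(-sigma * delta)  # opacity of this segment
        weight = transmittance * alpha          # contribution of this sample
        for k in range(num_classes):
            rendered[k] += weight * probs[k]
        transmittance *= 1.0 - alpha
    # Per-class scores for the pixel and the accumulated opacity.
    return rendered, 1.0 - transmittance

# Toy ray: empty space labelled class 0, then a dense surface labelled class 1.
scores, opacity = composite_semantics([0.0, 50.0], [1.0, 1.0],
                                      [[1.0, 0.0], [0.0, 1.0]])
```

Because the first sample has zero density, it contributes nothing, and the rendered label for the pixel is dominated by the class of the first opaque sample. Rendering every view from the same 3D field is what makes the resulting 2D labels consistent across views.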
A two-branch trade-off neural network for balanced scoring sleep stages on multiple cohorts
Introduction: Automatic sleep staging is a classification task with severe class imbalance, and the scoring of stage N1 is unstable. Decreased accuracy in classifying stage N1 significantly impacts the staging of individuals with sleep disorders. We aim to achieve automatic sleep staging with expert-level performance in both stage N1 and overall scoring.
Methods: A neural network model that combines an attention-based convolutional neural network with a two-branch classifier is developed. A transitive training strategy is employed to balance universal feature learning and contextual referencing. Parameter optimization and benchmark comparisons are conducted using a large-scale dataset, followed by evaluation on seven datasets in five cohorts.
Results: The proposed model achieves an accuracy of 88.16%, a Cohen’s kappa of 0.836, and an MF1 score of 0.818 on the SHHS1 test set, with performance comparable to human scorers in scoring stage N1. Incorporating data from multiple cohorts improves its performance. Notably, the model maintains high performance when applied to unseen datasets and to patients with neurological or psychiatric disorders.
Discussion: The proposed algorithm demonstrates strong performance and generalizability, and its direct transferability is noteworthy among similar studies on automated sleep staging. It is publicly available, which helps expand access to sleep-related analysis, especially analysis associated with neurological or psychiatric disorders.
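The three metrics reported above (accuracy, Cohen's kappa, and macro-averaged F1, taking MF1 to mean the unweighted mean of per-stage F1 scores) can all be computed from a confusion matrix. The sketch below uses the standard definitions; the function name `staging_metrics` and the two-class toy matrix are hypothetical and are not data from the study.

```python
def staging_metrics(confusion):
    """Return (accuracy, Cohen's kappa, macro-F1) for a square confusion matrix.

    confusion[i][j] is the number of epochs whose true stage is i
    and whose predicted stage is j.
    """
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    row_sums = [sum(row) for row in confusion]  # true counts per stage
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]

    # Observed agreement (accuracy) and agreement expected by chance.
    observed = sum(confusion[i][i] for i in range(n)) / total
    expected = sum(row_sums[i] * col_sums[i] for i in range(n)) / total ** 2
    kappa = (observed - expected) / (1.0 - expected)

    # Per-stage F1 = 2*TP / (2*TP + FP + FN) = 2*TP / (row sum + column sum).
    f1_scores = []
    for i in range(n):
        denom = row_sums[i] + col_sums[i]
        f1_scores.append(2.0 * confusion[i][i] / denom if denom else 0.0)
    macro_f1 = sum(f1_scores) / n
    return observed, kappa, macro_f1

# Made-up two-class matrix, for illustration only.
accuracy, kappa, mf1 = staging_metrics([[8, 2], [1, 9]])
```

Macro-averaging gives each stage equal weight regardless of how many epochs it has, which is why MF1 is sensitive to performance on a rare stage such as N1, whereas accuracy is dominated by the frequent stages.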
Residual Stress, Mechanical Properties, and Grain Morphology of Ti-6Al-4V Alloy Produced by Ultrasonic Impact Treatment Assisted Wire and Arc Additive Manufacturing
Ultrasonic Impact Treatment (UIT) is an effective technique for surface refinement and residual stress reduction, and it is widely used in welding. This study investigates UIT-assisted Wire and Arc Additive Manufacturing (WAAM). The residual stress, grain morphology, and mechanical properties of post-UIT and as-deposited samples are studied. The results demonstrate that UIT significantly reduces residual stress: the residual stress of the post-UIT samples is much lower than that of the as-deposited samples. The samples fabricated by UIT-assisted WAAM have a novel, bamboo-like distribution of prior-β grains, that is, an alternating distribution of short columnar grains and equiaxed grains. The grains in this bamboo-like structure are much smaller than the coarse columnar grains. In addition, the mechanical properties of the post-UIT and as-deposited samples are compared. The results indicate that the average tensile strength of the post-UIT samples is higher, while their average elongation is lower.
Bio-Inspired Drug Delivery Systems: From Synthetic Polypeptide Vesicles to Outer Membrane Vesicles
Nanomedicine is a broad field that focuses on the development of nanocarriers to deliver specific drugs to targeted sites. A synthetic polypeptide is a biomaterial composed of repeating amino acid units linked by peptide bonds. The amphiphilic segments of the polypeptide can assemble to form polypeptide vesicles (PVs) under suitable conditions. In contrast, outer membrane vesicles (OMVs) are spherical buds of the outer membrane filled with periplasmic content, and they commonly originate from Gram-negative bacteria. Owing to their biodegradability and excellent biocompatibility, both PVs and OMVs have been utilized as drug carriers. In this review, we discuss recent drug delivery research based on PVs and OMVs. The following topics are presented: (1) a brief introduction to the production methods for PVs and OMVs; (2) a thorough explanation of PV- and OMV-related applications in drug delivery, including vesicle design and biological assessment; and (3) a discussion of perspectives and future challenges related to PV- and OMV-based drug delivery systems.
Feature Pyramid Networks and Long Short-Term Memory for EEG Feature Map-Based Emotion Recognition
Raw EEG data are collected as 1D sequences, which ignore spatial topology information. Compared with a conventional CNN, Feature Pyramid Networks (FPN) are better at detecting small targets and at addressing insufficient feature extraction across scale transformations. We propose a method that combines FPN and Long Short-Term Memory (FPN-LSTM) for EEG feature map-based emotion recognition. According to the spatial arrangement of the EEG electrodes, the Azimuthal Equidistant Projection (AEP) is employed to generate a 2D EEG map that preserves the spatial topology information. Then, the average power, variance power, and standard deviation power of three frequency bands (α, β, and γ) are extracted as the feature data for the EEG feature map. Bicubic interpolation is employed to fill the blank pixels between electrodes, and the feature maps of the three frequency bands are used as the G, R, and B channels to generate EEG feature maps. We then propose weighting the channels: large weights are assigned to channels strongly correlated with emotion (AF3, F3, F7, FC5, and T7), and small weights to the others. The proposed FPN-LSTM is applied to the EEG feature maps for emotion recognition. The experimental results show that the proposed method achieves Valence and Arousal recognition rates of 90.05% and 90.84%, respectively.
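The projection step above can be illustrated with a minimal sketch of the azimuthal equidistant projection. It assumes each electrode position is a 3D point on a roughly spherical head model whose +z axis points to the top of the head. The function name `azimuthal_equidistant` and the coordinate convention are assumptions, not details from the abstract.

```python
import math

def azimuthal_equidistant(x, y, z):
    """Project a 3D electrode position onto a 2D plane.

    The top of the head (+z axis) maps to the origin, and the distance
    of each projected point from the origin equals the angle (in radians)
    between the electrode and the top of the head.
    """
    r = math.sqrt(x * x + y * y + z * z)  # must be non-zero
    # Clamp to guard against rounding slightly outside [-1, 1].
    polar = math.acos(max(-1.0, min(1.0, z / r)))  # angle from the +z axis
    azimuth = math.atan2(y, x)                     # direction around the head
    return polar * math.cos(azimuth), polar * math.sin(azimuth)

# Hypothetical electrode halfway between the top of the head and the +y side.
u, v = azimuthal_equidistant(0.0, 1.0, 1.0)
```

Because angular distance from the top of the head is preserved, neighbouring electrodes on the scalp stay neighbours in the 2D map. The projected points can then be placed on a pixel grid, with the gaps between electrodes filled by interpolation as the abstract describes.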